Tractable Group Detection on Large Link Data Sets
Authors
Abstract
Discovering underlying structure from co-occurrence data is an important task in a variety of fields, including insurance, intelligence, criminal investigation, epidemiology, human resources, and marketing. Previously, Kubica et al. presented the group detection algorithm (GDA), an algorithm for finding underlying groupings of entities from co-occurrence data. This algorithm is based on a probabilistic generative model and produces coherent groups that are consistent with prior knowledge. Unfortunately, the optimization used in GDA is slow, potentially making it infeasible for many large data sets. To address this, we present k-groups, an algorithm that uses an approach similar to that of k-means to significantly accelerate the discovery of groups while retaining GDA’s probabilistic model. We compare the performance of GDA and k-groups on a variety of data, showing that k-groups’ sacrifice in solution quality is significantly offset by its increase in speed.